For today's exercise: finding some interesting data. A little bit of googling around took me to this page: https://www.dataquest.io/blog/free-datasets-for-projects/ and, mostly at random, I decided to look into the World Bank (#12 on that list).
This one looked like an interesting first dataset: http://data.worldbank.org/data-catalog/world-development-indicators
The World Bank has a detailed API, and Oliver Sherouse has been kind enough to write a Python wrapper around it: http://wbdata.readthedocs.io/en/latest/ After pip install wbdata ...
In [1]:
import wbdata
In [2]:
%matplotlib inline
In [3]:
wbdata.get_source() # List world bank data sources
In [4]:
# List all available indicators from that source. Very long list. Nicely scrolled in local notebook, but
# overwhelming on github cache
# wbdata.get_indicator(source=2)
In [5]:
# wbdata.get_data("EG.USE.PCAP.KG.OE") # Returns long list of dictionary objects
wbdata.get_indicator("EG.USE.PCAP.KG.OE")
In [6]:
energy_use = wbdata.get_dataframe({"EG.USE.PCAP.KG.OE": "Energy Use"})["Energy Use"]
# What I've lost here is the metadata that makes the units immediately interpretable
# (this indicator is energy use in kg of oil equivalent per capita).
# But it's beautiful that you get index alignment.
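The index alignment mentioned in that comment is ordinary pandas behaviour, so it can be seen without calling the API. A minimal sketch, assuming toy placeholder values, made-up country names, and a hypothetical second column called "Other Indicator"; none of this is real World Bank data:

```python
import pandas as pd

# Toy (country, date) indexes. All names and values are placeholders,
# not real World Bank figures.
idx_a = pd.MultiIndex.from_tuples(
    [("Country A", "2013"), ("Country A", "2014"), ("Country B", "2014")],
    names=["country", "date"],
)
idx_b = pd.MultiIndex.from_tuples(
    [("Country B", "2014"), ("Country A", "2014")],  # different order, one row missing
    names=["country", "date"],
)
energy = pd.Series([1.0, 2.0, 3.0], index=idx_a)
other = pd.Series([30.0, 20.0], index=idx_b)

# Building a frame from both Series matches rows by index label, not by position.
# Labels present in only one Series get NaN in the other column.
aligned = pd.DataFrame({"Energy Use": energy, "Other Indicator": other})
print(aligned)
```

Each value ends up on the row for its own (country, date) label, regardless of the order the Series were built in.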
In [7]:
energy_use.head()
Out[7]:
In [8]:
type(energy_use)
Out[8]:
In [9]:
energy_use.index.get_level_values(0).unique()
Out[9]:
In [10]:
# Selecting from levels of multi-index
# energy_use.xs('2012', level="date")
energy_use["United States"].sort_index().plot()
Out[10]:
energy_use is a Series with a multi-index. To plot it, I want the countries as columns of a DataFrame. unstack does that for me, and then the built-in plot shows the selected columns in a single plot. Interesting parameters of plot: subplots, sharex, sharey, layout and figsize (see the commented-out call in the next cell).
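To see what unstack does to the shape, here is a small offline sketch. It assumes a toy Series with the same kind of (country, date) multi-index; the country names and values are placeholders, and the level name "country" is an assumption about what wbdata returns:

```python
import pandas as pd

# Toy Series shaped like energy_use: one value per (country, date) pair.
# Names and values are placeholders, not real data.
index = pd.MultiIndex.from_product(
    [["Country A", "Country B"], ["2012", "2013", "2014"]],
    names=["country", "date"],
)
toy = pd.Series([1.0, 2.0, 3.0, 10.0, 20.0, 30.0], index=index, name="Energy Use")

# unstack(0) pivots index level 0 (country) into columns,
# leaving date as the row index.
wide = toy.unstack(0)
print(wide)

# With matplotlib available, wide[["Country A", "Country B"]].plot()
# would draw one line per column on a shared axis.
```

The result is a DataFrame with one row per date and one column per country, which is the layout the DataFrame plot method expects.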
In [11]:
energy_use.unstack(0)[["Euro area", "United States", "Japan", "Iceland", "Trinidad and Tobago"]].plot()
#energy_use.unstack(0)[["Euro area", "United States", "Japan"]].plot(subplots=True, sharex=True,
# sharey=True, layout=(2,2), figsize=(12,6))
Out[11]:
In [12]:
import warnings
warnings.filterwarnings('ignore')
top_users = energy_use.xs('2014', level="date").sort_values(ascending=False)
In [13]:
top_users[:50].sort_values(ascending=True).plot.barh(figsize=(4,10))
Out[13]:
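The cross-section and ranking used in the two cells above can be tried offline as well. A rough sketch on placeholder data (made-up country names and values, and an assumed "country" level name), not the real indicator:

```python
import pandas as pd

# Toy Series with a (country, date) multi-index; values are placeholders.
index = pd.MultiIndex.from_product(
    [["Country A", "Country B", "Country C"], ["2013", "2014"]],
    names=["country", "date"],
)
toy = pd.Series([5.0, 6.0, 50.0, 40.0, 7.0, 9.0], index=index, name="Energy Use")

# xs selects one label from the "date" level and drops that level,
# so the result is indexed by country only.
one_year = toy.xs("2014", level="date")

# Sort descending and keep the two largest values.
top = one_year.sort_values(ascending=False)[:2]
print(top)
```

Re-sorting the slice in ascending order before calling plot.barh, as the cell above does, puts the largest bar at the top of the chart.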
What's going on with Trinidad & Tobago? Obviously, it's a small nation. In absolute terms its energy use isn't going to be a big deal, but per capita it pops to the top of the list. I'm not the only one who's asked: https://www.reddit.com/r/geography/comments/3kxv4u/why_does_trinidad_and_tobago_have_such_high/
From that thread, user smectite writes:
"Trinidad and Tobago is actually a surprisingly rich, industrialized nation. Their economy relies heavily on petrochemicals and this eats up a lot of energy. If you go down to page 14 on this report on renewable energy by the government you can see a pie chart breaking down the electricity demand by sector. The industrial sector takes up about 61% of the nations energy."
Does the World Bank give us the ability to break down energy consumption by sector? It would be interesting to look at residential use on its own, with industry split out, to see who's actually living environmentally.
Next step: start looking around for by-sector sources.